Lecture Notes: Introduction to R, RStudio and RMarkdown-V2.0

Henry Mendoza Rivera and Gloria R. Bautista Mendoza

August 26, 2020

Content

  1. Why R?

    • R installation
  2. RStudio

    • RStudio Desktop installation

    • RStudio Cloud

  3. RMarkdown

    • Creating a RMarkdown Document in RStudio Desktop

    • How to insert Images in RMarkdown file (html)

    • How to insert Tables in RMarkdown file (html)

  4. Reading Data set in R

  5. Installing packages in R

    • What is a Package in R?

    • How to install packages in R

    • Installing multiple packages

    • Where are the packages located?

Why R?

R installation

Step 1:

Go to https://www.r-project.org/

Step 2:

Select a CRAN mirror (a web server that hosts R, ready to download).

Step 3:

Choose the link appropriate for your computer (Linux \(\color{blue}{\text{1}}\), Mac \(\color{blue}{\text{2}}\), or Windows \(\color{blue}{\text{3}}\)) and follow the download instructions given. Accept all default options.

Step 3.1: Download R for macOS

Step 3.2: Install R on Windows

Procedure

R installation

With R installed, the next step is to install RStudio.

RStudio

RStudio is an integrated development environment (IDE) for R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management. https://rstudio.com/

RStudio Desktop installation

Step 1: Click on Download RStudio.

Step 2: Click on Download RStudio Desktop.

Step 3: Click on the file indicated by \(\color{blue}{\text{1}}\) to download RStudio for Mac, or on the file indicated by \(\color{blue}{\text{2}}\) for Windows.

Step 4: Follow the installation steps. Accept all default options.

Procedure

RStudio installation

RStudio Cloud

RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach and learn data science online.

  1. Go to RStudio Cloud.

  2. Click on Get Started for free.

  3. Click on Sign Up to create an account.

  4. Sign in with your username and password.

  5. Once you are inside RStudio Cloud, click on New Project. You can now use RStudio Cloud much as you would use RStudio Desktop.

RMarkdown

Markdown is a lightweight markup language with plain-text formatting syntax, created in 2004 by John Gruber with Aaron Swartz. RMarkdown extends Markdown so that text, R code, and the output of that code live in the same document. We will use RMarkdown to write homework solutions and any other report during the course.

Once you are in RStudio, you can write the RMarkdown document required for your homework submission.

Creating a RMarkdown Document in RStudio Desktop

Step 1: Creating and organizing folders on the Desktop

  1. Create a folder Stat371 on the Desktop.
  2. Create a subfolder Data inside the folder Stat371.
  3. Create a subfolder Graphs inside the folder Stat371.

Step 2: Set your working Directory

The working directory is a file path on your computer that sets the default location of any files you read into R. Set your working directory as follows:

  1. Go to the menu and click on Session. Then select Set Working Directory and click on Choose Directory.
  2. Now go to your Desktop, select the folder Stat371, and click on Open.
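The folder setup from Step 1 and the working directory from Step 2 can also be handled from the R console. The following is a minimal sketch, not part of the course procedure: it uses a temporary folder as a stand-in for the Desktop so that it runs on any computer. The variable names `base_dir` and `course_dir` are assumptions introduced for illustration; on your own machine `base_dir` would be the path to your Desktop.

```r
# Stand-in for the Desktop (assumption: replace with your own Desktop path)
base_dir <- tempdir()

# Build the path to the course folder and create both subfolders
course_dir <- file.path(base_dir, "Stat371")
dir.create(file.path(course_dir, "Data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(course_dir, "Graphs"), recursive = TRUE, showWarnings = FALSE)

# Set the working directory; setwd() returns the previous one
old_wd <- setwd(course_dir)

# Confirm where R is now looking for files
print(getwd())

# List the contents of the working directory (Data and Graphs)
subfolders <- list.files()
print(subfolders)

# Restore the previous working directory
setwd(old_wd)
```

Calling `getwd()` at any time is a quick way to check that the menu steps above had the intended effect.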

Step 3: Create a new RMarkdown document

  1. Open RStudio. (First check that you have installed R and then RStudio. If you have not installed them, go to Modules>Orientation>Introduction-to-R.html.)

  2. Go to the menu and click on File. Then select New File and RMarkdown.

  3. Select Document and HTML. Then complete the title and name fields.

Title: Homework 1

Name: your first and last name.

  4. Delete everything below \(\color{blue}{\text{## R Markdown }}\)

  5. Replace \(\color{blue}{\text{## R Markdown}}\) by \(\color{blue}{\text{# Situation 1}}\) or any other title.

Step 4: Write your document

Suppose that in your homework the first situation to solve looks like

Situation 1

Find and interpret the mean of the following data

160, 170, 175, 148, 175, 185, 190, 145, 162

Then you should proceed as follows:

  1. Copy and paste the situation statement from your homework document below \(\color{blue}{\text{# Situation 1}}\). Then write \(\color{blue}{\text{## Solution}}\).
  2. Insert a chunk: click on \(\color{blue}{\textbf{1}}\) and then click on \(\color{blue}{\textbf{2}}\). See the figure below.

  3. Use the c() function (where c stands for “combine”) to combine numbers into a vector. Copy and paste the code below into the chunk in your RMarkdown document.
Mydata.weight <- c(160, 170, 175, 148, 175, 185, 190, 145, 162) # weight in pounds
mean(Mydata.weight) # find the mean (average) of the weight
## [1] 167.7778

  4. Run the code inside the chunk as follows,

or go to \(\color{blue}{\textbf{1}}\) and then \(\color{blue}{\textbf{2}}\) as indicated in the figure below. This works on both Mac and Windows.

  5. Knit your document to get the HTML document.
  6. Write the interpretation.

Interpretation: the average weight of the people in this data set is about 167.8 lb.

\(\color{red}{\Large\textbf{Knit }}\) your document again every time you add a new piece of information.
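To see what mean() does with the Situation 1 data, the same result can be rebuilt from its definition: the sum of the values divided by how many there are. This is only a sketch for checking the arithmetic; the variable names `weight`, `total`, `n`, and `manual_mean` are chosen here for illustration.

```r
# Weights in pounds from Situation 1
weight <- c(160, 170, 175, 148, 175, 185, 190, 145, 162)

total <- sum(weight)     # sum of all nine values: 1510
n     <- length(weight)  # number of values: 9

# Mean by definition: total divided by the number of observations
manual_mean <- total / n

# TRUE when the manual value agrees with the built-in mean()
print(all.equal(manual_mean, mean(weight)))

# Rounded to one decimal for the written interpretation: 167.8
print(round(manual_mean, 1))
```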

How to insert Images in RMarkdown file (html)

We can insert images in the document. To insert an image:

  1. Place the image file in your Graphs folder.

  2. Outside a code chunk, write:

![](Graphs/path_to_your_image.jpg)

  3. To add alt text to your image, put it between the square brackets []:

![alt text here](Graphs/path_to_your_image.jpg)
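If an image does not show up after knitting, the path is a common cause. The sketch below assumes a hypothetical file name, `my_plot.jpg`, and assumes the working directory is the Stat371 folder. It builds the relative path, prints the Markdown line that would use it, and reports whether the file is actually there.

```r
# Hypothetical file name; replace it with the name of your own image
image_name <- "my_plot.jpg"

# Relative path from the working directory (Stat371) into the Graphs subfolder
image_path <- file.path("Graphs", image_name)

# The Markdown line that would go outside a code chunk
markdown_line <- paste0("![](", image_path, ")")
print(markdown_line)

# Report whether the file exists relative to the current working directory
if (file.exists(image_path)) {
  message("Image found: ", image_path)
} else {
  message("Image not found. Current working directory: ", getwd())
}
```

Note that when a document is knitted, relative paths are by default resolved from the folder that contains the .Rmd file, so saving the .Rmd file inside Stat371 keeps the Graphs/ path valid.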

How to insert Tables in RMarkdown file (html)

Create the table below using the Markdown tables generator.

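| Group | n  | Min   | Max   | Mean  | SD   | 1st Q | Median | 3rd Q |
|-------|----|-------|-------|-------|------|-------|--------|-------|
| Dead  | 10 | 17.65 | 24.59 | 20.79 | 2.22 | 19.74 | 20.59  | 21.72 |
| Live  | 12 | 18.95 | 27.14 | 23.16 | 2.76 | 20.92 | 23.16  | 25.17 |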

Example

I recommend using https://www.tablesgenerator.com/markdown_tables to generate a table like this one. The following steps will help you edit the table.

  1. Click on Column to insert a column.

  2. Click on Row to insert a row.

  3. Enter the information in the cells.

  4. Click on Column->Text Align. Then select the type of text alignment required.

  5. To delete a column, click on Column->Remove.

  6. To delete a row, click on Row->Remove.

  7. To insert a LaTeX symbol, click in the cell and write the LaTeX expression. For example, $\mu$
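The generator produces plain lines of text in which cells are separated by the | character. As a rough illustration of that format, the sketch below builds the same kind of lines in R from a few columns of the summary table above. The helper `to_pipe_table` is hypothetical (written here for illustration, not part of any package), and only four of the columns are used to keep the example short.

```r
# A few columns of the summary statistics from the table above
summary_stats <- data.frame(
  Group = c("Dead", "Live"),
  n     = c(10, 12),
  Mean  = c(20.79, 23.16),
  SD    = c(2.22, 2.76)
)

# Hypothetical helper: turn a data frame into Markdown pipe-table lines
to_pipe_table <- function(df) {
  # Header row with the column names
  header <- paste0("| ", paste(names(df), collapse = " | "), " |")
  # Separator row that tells Markdown this is a table
  rule <- paste0("|", paste(rep("---", ncol(df)), collapse = "|"), "|")
  # One text line per data row, cells joined by " | "
  cells <- unname(lapply(df, as.character))
  rows <- paste0("| ", do.call(paste, c(cells, sep = " | ")), " |")
  c(header, rule, rows)
}

table_lines <- to_pipe_table(summary_stats)
cat(table_lines, sep = "\n")
```

The printed lines can be pasted outside a code chunk, just like the output of the web generator. Inside a chunk, `knitr::kable()` is a ready-made alternative for turning a data frame into a table.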

Another type of table is the ANOVA table.

Create the table below using the Markdown tables generator.

| Source of Variation | Df | Sum Sq | Mean Sq | F Value | Pr(>F) |
|---------------------|----|--------|---------|---------|--------|
| Treat (between) | \(df_{Trt} = t-1\) | SSTrt | \(MSTrt =\frac{SSTrt}{df_{Trt}}\) | \(F =\frac{MSTrt}{MSE}\) | \(p = P(F_{df_{Trt}, df_{E}} > F)\) |
| Error (within) | \(df_{E} = N - t\) | SSE | \(MSE =\frac{SSE}{df_{E}}\) | | |
| Total | \(df_{Tot} = N - 1\) | SSTot | | | |
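To make the formulas in the ANOVA table concrete, here is a sketch that fills in each cell by hand and compares the result with R's own ANOVA output. It uses `PlantGrowth`, a small data set that ships with R (30 plant weights in 3 groups); that choice is an assumption made for illustration and is not part of the course material. Variable names such as `n_trt`, `df_trt`, and `df_err` are likewise introduced here.

```r
# PlantGrowth ships with R: plant weights under 3 conditions
y     <- PlantGrowth$weight
group <- PlantGrowth$group

N     <- length(y)       # total number of observations
n_trt <- nlevels(group)  # number of treatments (t in the table)

grand_mean <- mean(y)
fitted     <- ave(y, group)  # each observation's own group mean

# Sums of squares
SSTrt <- sum((fitted - grand_mean)^2)  # between treatments
SSE   <- sum((y - fitted)^2)           # within treatments (error)
SSTot <- sum((y - grand_mean)^2)       # total

# Degrees of freedom
df_trt <- n_trt - 1
df_err <- N - n_trt

# Mean squares, F statistic, and p-value
MSTrt   <- SSTrt / df_trt
MSE     <- SSE / df_err
F_stat  <- MSTrt / MSE
p_value <- pf(F_stat, df_trt, df_err, lower.tail = FALSE)

# R's built-in ANOVA table for comparison
anova_table <- anova(lm(weight ~ group, data = PlantGrowth))
print(anova_table)
print(c(F = F_stat, p = p_value))
```

The hand-computed F and p should agree with the `F value` and `Pr(>F)` columns of the printed table, and SSTrt + SSE should equal SSTot.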

Example

Reading Data set in R

Before reading a data set into R, make sure you have created the Data folder as explained in the \(\color{blue}{\text{Creating a RMarkdown}}\) section. We are going to read the lbw data set into R.

Data set lbw from the Low Birth Weight Study

  1. Go to Canvas, download the lbw data set from Modules>Students Resources>lbw.csv, and save it in the subfolder Data inside the folder Stat371.

  2. Set your working directory. The working directory is a file path on your computer that sets the default location of any files you read into R. Set your working directory as follows:

    • Go to the menu and click on Session. Then select Set Working Directory and click on Choose Directory.
    • Now go to your Desktop, select the folder Stat371, and click on Open.
  3. Insert a chunk: click on \(\color{blue}{\textbf{1}}\) and then click on \(\color{blue}{\textbf{2}}\).

  4. Copy and paste the code below inside the chunk.
lbw <- read.csv("Data/lbw.csv") # comma-separated file with "." as the decimal mark
  5. Run the code as follows:

    • Option 1:

Go to \(\color{blue}{\textbf{1}}\) and then \(\color{blue}{\textbf{2}}\) as indicated in the figure below. This works on both Mac and Windows.

  6. Check that the variables were read properly. We will use the R function str(). Add str(lbw) inside the chunk as follows.
lbw <- read.csv("Data/lbw.csv") # read the comma-separated file into a data frame
str(lbw)
## 'data.frame':    189 obs. of  10 variables:
##  $ low  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ smoke: int  0 0 1 1 1 0 0 0 1 1 ...
##  $ race : int  2 3 1 1 1 3 1 3 1 1 ...
##  $ age  : int  19 33 20 21 18 21 22 17 29 26 ...
##  $ lwt  : int  182 155 105 108 107 124 118 103 123 113 ...
##  $ ptl  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ ht   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ ui   : int  1 0 0 1 1 0 0 0 0 0 ...
##  $ ftv  : int  0 3 1 2 0 0 1 1 1 0 ...
##  $ bwt  : int  2523 2551 2557 2594 2600 2622 2637 2637 2663 2665 ...

Alternatively, go to \(\color{blue}{\textbf{1}}\) and then \(\color{blue}{\textbf{2}}\) as indicated in the figure below. This works on both Mac and Windows.

Notice that R read the variables low, smoke, race, ht, and ui as integers. Let’s change them to factors (categorical variables). To do this, we use the function as.factor() as follows:

lbw <- read.csv("Data/lbw.csv")
# convert the categorical variables from integer to factor
lbw$low <- as.factor(lbw$low)
lbw$smoke <- as.factor(lbw$smoke)
lbw$race <- as.factor(lbw$race)
lbw$ht <- as.factor(lbw$ht)
lbw$ui <- as.factor(lbw$ui)

Now run str(lbw) again:

str(lbw)
## 'data.frame':    189 obs. of  10 variables:
##  $ low  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ smoke: Factor w/ 2 levels "0","1": 1 1 2 2 2 1 1 1 2 2 ...
##  $ race : Factor w/ 3 levels "1","2","3": 2 3 1 1 1 3 1 3 1 1 ...
##  $ age  : int  19 33 20 21 18 21 22 17 29 26 ...
##  $ lwt  : int  182 155 105 108 107 124 118 103 123 113 ...
##  $ ptl  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ ht   : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ ui   : Factor w/ 2 levels "0","1": 2 1 1 2 2 1 1 1 1 1 ...
##  $ ftv  : int  0 3 1 2 0 0 1 1 1 0 ...
##  $ bwt  : int  2523 2551 2557 2594 2600 2622 2637 2637 2663 2665 ...
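Converting the variables one line at a time works well for five columns. For longer lists, the same conversion can be written once and applied to all of them. The sketch below is one possible way to do that. It does not read the real file: `lbw_demo` is a small stand-in built from the first ten values of six variables shown in the str() output above, and the names `lbw_demo` and `cat_vars` are introduced here for illustration.

```r
# Stand-in data: first ten values of six variables from the str() output above
lbw_demo <- data.frame(
  low   = as.integer(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)),
  smoke = as.integer(c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1)),
  race  = as.integer(c(2, 3, 1, 1, 1, 3, 1, 3, 1, 1)),
  age   = as.integer(c(19, 33, 20, 21, 18, 21, 22, 17, 29, 26)),
  ht    = as.integer(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)),
  ui    = as.integer(c(1, 0, 0, 1, 1, 0, 0, 0, 0, 0))
)

# Names of the categorical variables
cat_vars <- c("low", "smoke", "race", "ht", "ui")

# Apply as.factor() to each of those columns in one step
lbw_demo[cat_vars] <- lapply(lbw_demo[cat_vars], as.factor)

# Check the result: the five variables are factors, age is still an integer
str(lbw_demo)
```

Because the stand-in holds only ten rows, some factors here show a single level (for example low), whereas the full data set shows two.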

Installing packages in R

What is a Package in R?

help(package = "ggplot2")

Where are the packages located?

Packages are located in
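The exact folder depends on the operating system and the R version, so it is easiest to ask R directly. A minimal sketch using two base R functions, `.libPaths()` and `find.package()`; the example looks up `stats` because that package ships with every R installation, and the variable names are chosen here for illustration.

```r
# Folders (libraries) where R looks for installed packages
lib_dirs <- .libPaths()
print(lib_dirs)

# Folder of one specific installed package; "stats" ships with R
stats_dir <- find.package("stats")
print(stats_dir)
```

Once ggplot2 has been installed, `find.package("ggplot2")` would report its folder in the same way.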

How to install packages in R

For example, to use the ggplot2 package in R, follow these steps:

Step 1:

Install the package ggplot2. Use the R function install.packages() as follows (place the package name in quotes). Before running the code below, delete the # symbol. Then run the code.

# install.packages("ggplot2")

Step 2:

Load the package into R with the R function library(). (The package name does not need to be in quotes.)

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.2

A blue greater-than symbol (\(>\)) should then appear in the console.

Step 3:

Add the # symbol back to the code in Step 1 (the line turns green in the RMarkdown chunk), so that the package is not installed again every time you run or knit the document.

Installing multiple packages

Run the code below without the # in the first line, that is, install.packages(c("ggplot2","ggpubr")). After you run it, add the # back to the first line: #install.packages(c("ggplot2","ggpubr"))

#install.packages(c("ggplot2","ggpubr"))
library(ggplot2) # library() loads one package per call
library(ggpubr)
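Commenting the install line in and out guards against reinstalling a package on every knit. An alternative sketch is to install only what is missing. The helper name `find_missing` is hypothetical (it is not from these notes or from any package); `requireNamespace()` and `install.packages()` are standard R functions. The demonstration uses `stats` and `utils`, which ship with R, so nothing is downloaded when the code runs.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Demonstration with packages that ship with R
base_pkgs <- c("stats", "utils")
missing_base <- find_missing(base_pkgs)
print(missing_base)  # character(0): nothing to install

# For the course packages the same pattern would be (left commented out,
# as in the notes, so that knitting never triggers a download):
# to_install <- find_missing(c("ggplot2", "ggpubr"))
# if (length(to_install) > 0) install.packages(to_install)
# library(ggplot2)
# library(ggpubr)
```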